EDA of Loan Data from Prosper

by Rawan Almohimeed

The dataset I will be exploring in this report is Prosper Loans Data. Prosper is a peer-to-peer lending marketplace, with over $7 billion in funded loans.[1] Borrowers request personal loans on Prosper and investors (individual or institutional) can fund anywhere from $2,000 to $35,000 per loan request. Investors can consider borrowers’ credit scores, ratings, and histories and the category of the loan. Prosper handles the servicing of the loan and collects and distributes borrower payments and interest back to the loan investors.[2]

Through this EDA I will explore the structure of the data and try to answer questions such as: What are the characteristics of borrowers and loans? How does this marketplace evaluate loans and estimate risk? Other questions will come up as I go through the analysis.

Univariate Plots Section

In this section I will explore the dataset structure and provide some plots and statistics.

Row count, structure, and summary of the whole dataset:

## [1] 113937
## 'data.frame':    113937 obs. of  81 variables:
##  $ ListingKey                         : Factor w/ 113066 levels "00003546482094282EF90E5",..: 7180 7193 6647 6669 6686 6689 6699 6706 6687 6687 ...
##  $ ListingNumber                      : int  193129 1209647 81716 658116 909464 1074836 750899 768193 1023355 1023355 ...
##  $ ListingCreationDate                : Factor w/ 113064 levels "2005-11-09 20:44:28.847000000",..: 14184 111894 6429 64760 85967 100310 72556 74019 97834 97834 ...
##  $ CreditGrade                        : Factor w/ 9 levels "","A","AA","B",..: 5 1 8 1 1 1 1 1 1 1 ...
##  $ Term                               : int  36 36 36 36 36 60 36 36 36 36 ...
##  $ LoanStatus                         : Factor w/ 12 levels "Cancelled","Chargedoff",..: 3 4 3 4 4 4 4 4 4 4 ...
##  $ ClosedDate                         : Factor w/ 2803 levels "","2005-11-25 00:00:00",..: 1138 1 1263 1 1 1 1 1 1 1 ...
##  $ BorrowerAPR                        : num  0.165 0.12 0.283 0.125 0.246 ...
##  $ BorrowerRate                       : num  0.158 0.092 0.275 0.0974 0.2085 ...
##  $ LenderYield                        : num  0.138 0.082 0.24 0.0874 0.1985 ...
##  $ EstimatedEffectiveYield            : num  NA 0.0796 NA 0.0849 0.1832 ...
##  $ EstimatedLoss                      : num  NA 0.0249 NA 0.0249 0.0925 ...
##  $ EstimatedReturn                    : num  NA 0.0547 NA 0.06 0.0907 ...
##  $ ProsperRating..numeric.            : int  NA 6 NA 6 3 5 2 4 7 7 ...
##  $ ProsperRating..Alpha.              : Factor w/ 8 levels "","A","AA","B",..: 1 2 1 2 6 4 7 5 3 3 ...
##  $ ProsperScore                       : num  NA 7 NA 9 4 10 2 4 9 11 ...
##  $ ListingCategory..numeric.          : int  0 2 0 16 2 1 1 2 7 7 ...
##  $ BorrowerState                      : Factor w/ 52 levels "","AK","AL","AR",..: 7 7 12 12 25 34 18 6 16 16 ...
##  $ Occupation                         : Factor w/ 68 levels "","Accountant/CPA",..: 37 43 37 52 21 43 50 29 24 24 ...
##  $ EmploymentStatus                   : Factor w/ 9 levels "","Employed",..: 9 2 4 2 2 2 2 2 2 2 ...
##  $ EmploymentStatusDuration           : int  2 44 NA 113 44 82 172 103 269 269 ...
##  $ IsBorrowerHomeowner                : Factor w/ 2 levels "False","True": 2 1 1 2 2 2 1 1 2 2 ...
##  $ CurrentlyInGroup                   : Factor w/ 2 levels "False","True": 2 1 2 1 1 1 1 1 1 1 ...
##  $ GroupKey                           : Factor w/ 707 levels "","00343376901312423168731",..: 1 1 335 1 1 1 1 1 1 1 ...
##  $ DateCreditPulled                   : Factor w/ 112992 levels "2005-11-09 00:30:04.487000000",..: 14347 111883 6446 64724 85857 100382 72500 73937 97888 97888 ...
##  $ CreditScoreRangeLower              : int  640 680 480 800 680 740 680 700 820 820 ...
##  $ CreditScoreRangeUpper              : int  659 699 499 819 699 759 699 719 839 839 ...
##  $ FirstRecordedCreditLine            : Factor w/ 11586 levels "","1947-08-24 00:00:00",..: 8639 6617 8927 2247 9498 497 8265 7685 5543 5543 ...
##  $ CurrentCreditLines                 : int  5 14 NA 5 19 21 10 6 17 17 ...
##  $ OpenCreditLines                    : int  4 14 NA 5 19 17 7 6 16 16 ...
##  $ TotalCreditLinespast7years         : int  12 29 3 29 49 49 20 10 32 32 ...
##  $ OpenRevolvingAccounts              : int  1 13 0 7 6 13 6 5 12 12 ...
##  $ OpenRevolvingMonthlyPayment        : num  24 389 0 115 220 1410 214 101 219 219 ...
##  $ InquiriesLast6Months               : int  3 3 0 0 1 0 0 3 1 1 ...
##  $ TotalInquiries                     : num  3 5 1 1 9 2 0 16 6 6 ...
##  $ CurrentDelinquencies               : int  2 0 1 4 0 0 0 0 0 0 ...
##  $ AmountDelinquent                   : num  472 0 NA 10056 0 ...
##  $ DelinquenciesLast7Years            : int  4 0 0 14 0 0 0 0 0 0 ...
##  $ PublicRecordsLast10Years           : int  0 1 0 0 0 0 0 1 0 0 ...
##  $ PublicRecordsLast12Months          : int  0 0 NA 0 0 0 0 0 0 0 ...
##  $ RevolvingCreditBalance             : num  0 3989 NA 1444 6193 ...
##  $ BankcardUtilization                : num  0 0.21 NA 0.04 0.81 0.39 0.72 0.13 0.11 0.11 ...
##  $ AvailableBankcardCredit            : num  1500 10266 NA 30754 695 ...
##  $ TotalTrades                        : num  11 29 NA 26 39 47 16 10 29 29 ...
##  $ TradesNeverDelinquent..percentage. : num  0.81 1 NA 0.76 0.95 1 0.68 0.8 1 1 ...
##  $ TradesOpenedLast6Months            : num  0 2 NA 0 2 0 0 0 1 1 ...
##  $ DebtToIncomeRatio                  : num  0.17 0.18 0.06 0.15 0.26 0.36 0.27 0.24 0.25 0.25 ...
##  $ IncomeRange                        : Factor w/ 8 levels "$0","$1-24,999",..: 4 5 7 4 3 3 4 4 4 4 ...
##  $ IncomeVerifiable                   : Factor w/ 2 levels "False","True": 2 2 2 2 2 2 2 2 2 2 ...
##  $ StatedMonthlyIncome                : num  3083 6125 2083 2875 9583 ...
##  $ LoanKey                            : Factor w/ 113066 levels "00003683605746079487FF7",..: 100337 69837 46303 70776 71387 86505 91250 5425 908 908 ...
##  $ TotalProsperLoans                  : int  NA NA NA NA 1 NA NA NA NA NA ...
##  $ TotalProsperPaymentsBilled         : int  NA NA NA NA 11 NA NA NA NA NA ...
##  $ OnTimeProsperPayments              : int  NA NA NA NA 11 NA NA NA NA NA ...
##  $ ProsperPaymentsLessThanOneMonthLate: int  NA NA NA NA 0 NA NA NA NA NA ...
##  $ ProsperPaymentsOneMonthPlusLate    : int  NA NA NA NA 0 NA NA NA NA NA ...
##  $ ProsperPrincipalBorrowed           : num  NA NA NA NA 11000 NA NA NA NA NA ...
##  $ ProsperPrincipalOutstanding        : num  NA NA NA NA 9948 ...
##  $ ScorexChangeAtTimeOfListing        : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ LoanCurrentDaysDelinquent          : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ LoanFirstDefaultedCycleNumber      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ LoanMonthsSinceOrigination         : int  78 0 86 16 6 3 11 10 3 3 ...
##  $ LoanNumber                         : int  19141 134815 6466 77296 102670 123257 88353 90051 121268 121268 ...
##  $ LoanOriginalAmount                 : int  9425 10000 3001 10000 15000 15000 3000 10000 10000 10000 ...
##  $ LoanOriginationDate                : Factor w/ 1873 levels "2005-11-15 00:00:00",..: 426 1866 260 1535 1757 1821 1649 1666 1813 1813 ...
##  $ LoanOriginationQuarter             : Factor w/ 33 levels "Q1 2006","Q1 2007",..: 18 8 2 32 24 33 16 16 33 33 ...
##  $ MemberKey                          : Factor w/ 90831 levels "00003397697413387CAF966",..: 11071 10302 33781 54939 19465 48037 60448 40951 26129 26129 ...
##  $ MonthlyLoanPayment                 : num  330 319 123 321 564 ...
##  $ LP_CustomerPayments                : num  11396 0 4187 5143 2820 ...
##  $ LP_CustomerPrincipalPayments       : num  9425 0 3001 4091 1563 ...
##  $ LP_InterestandFees                 : num  1971 0 1186 1052 1257 ...
##  $ LP_ServiceFees                     : num  -133.2 0 -24.2 -108 -60.3 ...
##  $ LP_CollectionFees                  : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_GrossPrincipalLoss              : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_NetPrincipalLoss                : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_NonPrincipalRecoverypayments    : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ PercentFunded                      : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ Recommendations                    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ InvestmentFromFriendsCount         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ InvestmentFromFriendsAmount        : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Investors                          : int  258 1 41 158 20 1 1 1 1 1 ...
##  [1] "ListingKey"                         
##  [2] "ListingNumber"                      
##  [3] "ListingCreationDate"                
##  [4] "CreditGrade"                        
##  [5] "Term"                               
##  [6] "LoanStatus"                         
##  [7] "ClosedDate"                         
##  [8] "BorrowerAPR"                        
##  [9] "BorrowerRate"                       
## [10] "LenderYield"                        
## [11] "EstimatedEffectiveYield"            
## [12] "EstimatedLoss"                      
## [13] "EstimatedReturn"                    
## [14] "ProsperRating..numeric."            
## [15] "ProsperRating..Alpha."              
## [16] "ProsperScore"                       
## [17] "ListingCategory..numeric."          
## [18] "BorrowerState"                      
## [19] "Occupation"                         
## [20] "EmploymentStatus"                   
## [21] "EmploymentStatusDuration"           
## [22] "IsBorrowerHomeowner"                
## [23] "CurrentlyInGroup"                   
## [24] "GroupKey"                           
## [25] "DateCreditPulled"                   
## [26] "CreditScoreRangeLower"              
## [27] "CreditScoreRangeUpper"              
## [28] "FirstRecordedCreditLine"            
## [29] "CurrentCreditLines"                 
## [30] "OpenCreditLines"                    
## [31] "TotalCreditLinespast7years"         
## [32] "OpenRevolvingAccounts"              
## [33] "OpenRevolvingMonthlyPayment"        
## [34] "InquiriesLast6Months"               
## [35] "TotalInquiries"                     
## [36] "CurrentDelinquencies"               
## [37] "AmountDelinquent"                   
## [38] "DelinquenciesLast7Years"            
## [39] "PublicRecordsLast10Years"           
## [40] "PublicRecordsLast12Months"          
## [41] "RevolvingCreditBalance"             
## [42] "BankcardUtilization"                
## [43] "AvailableBankcardCredit"            
## [44] "TotalTrades"                        
## [45] "TradesNeverDelinquent..percentage." 
## [46] "TradesOpenedLast6Months"            
## [47] "DebtToIncomeRatio"                  
## [48] "IncomeRange"                        
## [49] "IncomeVerifiable"                   
## [50] "StatedMonthlyIncome"                
## [51] "LoanKey"                            
## [52] "TotalProsperLoans"                  
## [53] "TotalProsperPaymentsBilled"         
## [54] "OnTimeProsperPayments"              
## [55] "ProsperPaymentsLessThanOneMonthLate"
## [56] "ProsperPaymentsOneMonthPlusLate"    
## [57] "ProsperPrincipalBorrowed"           
## [58] "ProsperPrincipalOutstanding"        
## [59] "ScorexChangeAtTimeOfListing"        
## [60] "LoanCurrentDaysDelinquent"          
## [61] "LoanFirstDefaultedCycleNumber"      
## [62] "LoanMonthsSinceOrigination"         
## [63] "LoanNumber"                         
## [64] "LoanOriginalAmount"                 
## [65] "LoanOriginationDate"                
## [66] "LoanOriginationQuarter"             
## [67] "MemberKey"                          
## [68] "MonthlyLoanPayment"                 
## [69] "LP_CustomerPayments"                
## [70] "LP_CustomerPrincipalPayments"       
## [71] "LP_InterestandFees"                 
## [72] "LP_ServiceFees"                     
## [73] "LP_CollectionFees"                  
## [74] "LP_GrossPrincipalLoss"              
## [75] "LP_NetPrincipalLoss"                
## [76] "LP_NonPrincipalRecoverypayments"    
## [77] "PercentFunded"                      
## [78] "Recommendations"                    
## [79] "InvestmentFromFriendsCount"         
## [80] "InvestmentFromFriendsAmount"        
## [81] "Investors"
##                    ListingKey     ListingNumber    
##  17A93590655669644DB4C06:     6   Min.   :      4  
##  349D3587495831350F0F648:     4   1st Qu.: 400919  
##  47C1359638497431975670B:     4   Median : 600554  
##  8474358854651984137201C:     4   Mean   : 627886  
##  DE8535960513435199406CE:     4   3rd Qu.: 892634  
##  04C13599434217079754AEE:     3   Max.   :1255725  
##  (Other)                :113912                    
##                     ListingCreationDate  CreditGrade         Term      
##  2013-10-02 17:20:16.550000000:     6          :84984   Min.   :12.00  
##  2013-08-28 20:31:41.107000000:     4   C      : 5649   1st Qu.:36.00  
##  2013-09-08 09:27:44.853000000:     4   D      : 5153   Median :36.00  
##  2013-12-06 05:43:13.830000000:     4   B      : 4389   Mean   :40.83  
##  2013-12-06 11:44:58.283000000:     4   AA     : 3509   3rd Qu.:36.00  
##  2013-08-21 07:25:22.360000000:     3   HR     : 3508   Max.   :60.00  
##  (Other)                      :113912   (Other): 6745                  
##                  LoanStatus                  ClosedDate   
##  Current              :56576                      :58848  
##  Completed            :38074   2014-03-04 00:00:00:  105  
##  Chargedoff           :11992   2014-02-19 00:00:00:  100  
##  Defaulted            : 5018   2014-02-11 00:00:00:   92  
##  Past Due (1-15 days) :  806   2012-10-30 00:00:00:   81  
##  Past Due (31-60 days):  363   2013-02-26 00:00:00:   78  
##  (Other)              : 1108   (Other)            :54633  
##   BorrowerAPR       BorrowerRate     LenderYield     
##  Min.   :0.00653   Min.   :0.0000   Min.   :-0.0100  
##  1st Qu.:0.15629   1st Qu.:0.1340   1st Qu.: 0.1242  
##  Median :0.20976   Median :0.1840   Median : 0.1730  
##  Mean   :0.21883   Mean   :0.1928   Mean   : 0.1827  
##  3rd Qu.:0.28381   3rd Qu.:0.2500   3rd Qu.: 0.2400  
##  Max.   :0.51229   Max.   :0.4975   Max.   : 0.4925  
##  NA's   :25                                          
##  EstimatedEffectiveYield EstimatedLoss   EstimatedReturn 
##  Min.   :-0.183          Min.   :0.005   Min.   :-0.183  
##  1st Qu.: 0.116          1st Qu.:0.042   1st Qu.: 0.074  
##  Median : 0.162          Median :0.072   Median : 0.092  
##  Mean   : 0.169          Mean   :0.080   Mean   : 0.096  
##  3rd Qu.: 0.224          3rd Qu.:0.112   3rd Qu.: 0.117  
##  Max.   : 0.320          Max.   :0.366   Max.   : 0.284  
##  NA's   :29084           NA's   :29084   NA's   :29084   
##  ProsperRating..numeric. ProsperRating..Alpha.  ProsperScore  
##  Min.   :1.000                  :29084         Min.   : 1.00  
##  1st Qu.:3.000           C      :18345         1st Qu.: 4.00  
##  Median :4.000           B      :15581         Median : 6.00  
##  Mean   :4.072           A      :14551         Mean   : 5.95  
##  3rd Qu.:5.000           D      :14274         3rd Qu.: 8.00  
##  Max.   :7.000           E      : 9795         Max.   :11.00  
##  NA's   :29084           (Other):12307         NA's   :29084  
##  ListingCategory..numeric. BorrowerState  
##  Min.   : 0.000            CA     :14717  
##  1st Qu.: 1.000            TX     : 6842  
##  Median : 1.000            NY     : 6729  
##  Mean   : 2.774            FL     : 6720  
##  3rd Qu.: 3.000            IL     : 5921  
##  Max.   :20.000                   : 5515  
##                            (Other):67493  
##                     Occupation         EmploymentStatus
##  Other                   :28617   Employed     :67322  
##  Professional            :13628   Full-time    :26355  
##  Computer Programmer     : 4478   Self-employed: 6134  
##  Executive               : 4311   Not available: 5347  
##  Teacher                 : 3759   Other        : 3806  
##  Administrative Assistant: 3688                : 2255  
##  (Other)                 :55456   (Other)      : 2718  
##  EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
##  Min.   :  0.00           False:56459         False:101218    
##  1st Qu.: 26.00           True :57478         True : 12719    
##  Median : 67.00                                               
##  Mean   : 96.07                                               
##  3rd Qu.:137.00                                               
##  Max.   :755.00                                               
##  NA's   :7625                                                 
##                     GroupKey                 DateCreditPulled 
##                         :100596   2013-12-23 09:38:12:     6  
##  783C3371218786870A73D20:  1140   2013-11-21 09:09:41:     4  
##  3D4D3366260257624AB272D:   916   2013-12-06 05:43:16:     4  
##  6A3B336601725506917317E:   698   2014-01-14 20:17:49:     4  
##  FEF83377364176536637E50:   611   2014-02-09 12:14:41:     4  
##  C9643379247860156A00EC0:   342   2013-09-27 22:04:54:     3  
##  (Other)                :  9634   (Other)            :113912  
##  CreditScoreRangeLower CreditScoreRangeUpper
##  Min.   :  0.0         Min.   : 19.0        
##  1st Qu.:660.0         1st Qu.:679.0        
##  Median :680.0         Median :699.0        
##  Mean   :685.6         Mean   :704.6        
##  3rd Qu.:720.0         3rd Qu.:739.0        
##  Max.   :880.0         Max.   :899.0        
##  NA's   :591           NA's   :591          
##         FirstRecordedCreditLine CurrentCreditLines OpenCreditLines
##                     :   697     Min.   : 0.00      Min.   : 0.00  
##  1993-12-01 00:00:00:   185     1st Qu.: 7.00      1st Qu.: 6.00  
##  1994-11-01 00:00:00:   178     Median :10.00      Median : 9.00  
##  1995-11-01 00:00:00:   168     Mean   :10.32      Mean   : 9.26  
##  1990-04-01 00:00:00:   161     3rd Qu.:13.00      3rd Qu.:12.00  
##  1995-03-01 00:00:00:   159     Max.   :59.00      Max.   :54.00  
##  (Other)            :112389     NA's   :7604       NA's   :7604   
##  TotalCreditLinespast7years OpenRevolvingAccounts
##  Min.   :  2.00             Min.   : 0.00        
##  1st Qu.: 17.00             1st Qu.: 4.00        
##  Median : 25.00             Median : 6.00        
##  Mean   : 26.75             Mean   : 6.97        
##  3rd Qu.: 35.00             3rd Qu.: 9.00        
##  Max.   :136.00             Max.   :51.00        
##  NA's   :697                                     
##  OpenRevolvingMonthlyPayment InquiriesLast6Months TotalInquiries   
##  Min.   :    0.0             Min.   :  0.000      Min.   :  0.000  
##  1st Qu.:  114.0             1st Qu.:  0.000      1st Qu.:  2.000  
##  Median :  271.0             Median :  1.000      Median :  4.000  
##  Mean   :  398.3             Mean   :  1.435      Mean   :  5.584  
##  3rd Qu.:  525.0             3rd Qu.:  2.000      3rd Qu.:  7.000  
##  Max.   :14985.0             Max.   :105.000      Max.   :379.000  
##                              NA's   :697          NA's   :1159     
##  CurrentDelinquencies AmountDelinquent   DelinquenciesLast7Years
##  Min.   : 0.0000      Min.   :     0.0   Min.   : 0.000         
##  1st Qu.: 0.0000      1st Qu.:     0.0   1st Qu.: 0.000         
##  Median : 0.0000      Median :     0.0   Median : 0.000         
##  Mean   : 0.5921      Mean   :   984.5   Mean   : 4.155         
##  3rd Qu.: 0.0000      3rd Qu.:     0.0   3rd Qu.: 3.000         
##  Max.   :83.0000      Max.   :463881.0   Max.   :99.000         
##  NA's   :697          NA's   :7622       NA's   :990            
##  PublicRecordsLast10Years PublicRecordsLast12Months RevolvingCreditBalance
##  Min.   : 0.0000          Min.   : 0.000            Min.   :      0       
##  1st Qu.: 0.0000          1st Qu.: 0.000            1st Qu.:   3121       
##  Median : 0.0000          Median : 0.000            Median :   8549       
##  Mean   : 0.3126          Mean   : 0.015            Mean   :  17599       
##  3rd Qu.: 0.0000          3rd Qu.: 0.000            3rd Qu.:  19521       
##  Max.   :38.0000          Max.   :20.000            Max.   :1435667       
##  NA's   :697              NA's   :7604              NA's   :7604          
##  BankcardUtilization AvailableBankcardCredit  TotalTrades    
##  Min.   :0.000       Min.   :     0          Min.   :  0.00  
##  1st Qu.:0.310       1st Qu.:   880          1st Qu.: 15.00  
##  Median :0.600       Median :  4100          Median : 22.00  
##  Mean   :0.561       Mean   : 11210          Mean   : 23.23  
##  3rd Qu.:0.840       3rd Qu.: 13180          3rd Qu.: 30.00  
##  Max.   :5.950       Max.   :646285          Max.   :126.00  
##  NA's   :7604        NA's   :7544            NA's   :7544    
##  TradesNeverDelinquent..percentage. TradesOpenedLast6Months
##  Min.   :0.000                      Min.   : 0.000         
##  1st Qu.:0.820                      1st Qu.: 0.000         
##  Median :0.940                      Median : 0.000         
##  Mean   :0.886                      Mean   : 0.802         
##  3rd Qu.:1.000                      3rd Qu.: 1.000         
##  Max.   :1.000                      Max.   :20.000         
##  NA's   :7544                       NA's   :7544           
##  DebtToIncomeRatio         IncomeRange    IncomeVerifiable
##  Min.   : 0.000    $25,000-49,999:32192   False:  8669    
##  1st Qu.: 0.140    $50,000-74,999:31050   True :105268    
##  Median : 0.220    $100,000+     :17337                   
##  Mean   : 0.276    $75,000-99,999:16916                   
##  3rd Qu.: 0.320    Not displayed : 7741                   
##  Max.   :10.010    $1-24,999     : 7274                   
##  NA's   :8554      (Other)       : 1427                   
##  StatedMonthlyIncome                    LoanKey       TotalProsperLoans
##  Min.   :      0     CB1B37030986463208432A1:     6   Min.   :0.00     
##  1st Qu.:   3200     2DEE3698211017519D7333F:     4   1st Qu.:1.00     
##  Median :   4667     9F4B37043517554537C364C:     4   Median :1.00     
##  Mean   :   5608     D895370150591392337ED6D:     4   Mean   :1.42     
##  3rd Qu.:   6825     E6FB37073953690388BC56D:     4   3rd Qu.:2.00     
##  Max.   :1750003     0D8F37036734373301ED419:     3   Max.   :8.00     
##                      (Other)                :113912   NA's   :91852    
##  TotalProsperPaymentsBilled OnTimeProsperPayments
##  Min.   :  0.00             Min.   :  0.00       
##  1st Qu.:  9.00             1st Qu.:  9.00       
##  Median : 16.00             Median : 15.00       
##  Mean   : 22.93             Mean   : 22.27       
##  3rd Qu.: 33.00             3rd Qu.: 32.00       
##  Max.   :141.00             Max.   :141.00       
##  NA's   :91852              NA's   :91852        
##  ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
##  Min.   : 0.00                       Min.   : 0.00                  
##  1st Qu.: 0.00                       1st Qu.: 0.00                  
##  Median : 0.00                       Median : 0.00                  
##  Mean   : 0.61                       Mean   : 0.05                  
##  3rd Qu.: 0.00                       3rd Qu.: 0.00                  
##  Max.   :42.00                       Max.   :21.00                  
##  NA's   :91852                       NA's   :91852                  
##  ProsperPrincipalBorrowed ProsperPrincipalOutstanding
##  Min.   :    0            Min.   :    0              
##  1st Qu.: 3500            1st Qu.:    0              
##  Median : 6000            Median : 1627              
##  Mean   : 8472            Mean   : 2930              
##  3rd Qu.:11000            3rd Qu.: 4127              
##  Max.   :72499            Max.   :23451              
##  NA's   :91852            NA's   :91852              
##  ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
##  Min.   :-209.00             Min.   :   0.0           
##  1st Qu.: -35.00             1st Qu.:   0.0           
##  Median :  -3.00             Median :   0.0           
##  Mean   :  -3.22             Mean   : 152.8           
##  3rd Qu.:  25.00             3rd Qu.:   0.0           
##  Max.   : 286.00             Max.   :2704.0           
##  NA's   :95009                                        
##  LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination   LoanNumber    
##  Min.   : 0.00                 Min.   :  0.0              Min.   :     1  
##  1st Qu.: 9.00                 1st Qu.:  6.0              1st Qu.: 37332  
##  Median :14.00                 Median : 21.0              Median : 68599  
##  Mean   :16.27                 Mean   : 31.9              Mean   : 69444  
##  3rd Qu.:22.00                 3rd Qu.: 65.0              3rd Qu.:101901  
##  Max.   :44.00                 Max.   :100.0              Max.   :136486  
##  NA's   :96985                                                            
##  LoanOriginalAmount          LoanOriginationDate LoanOriginationQuarter
##  Min.   : 1000      2014-01-22 00:00:00:   491   Q4 2013:14450         
##  1st Qu.: 4000      2013-11-13 00:00:00:   490   Q1 2014:12172         
##  Median : 6500      2014-02-19 00:00:00:   439   Q3 2013: 9180         
##  Mean   : 8337      2013-10-16 00:00:00:   434   Q2 2013: 7099         
##  3rd Qu.:12000      2014-01-28 00:00:00:   339   Q3 2012: 5632         
##  Max.   :35000      2013-09-24 00:00:00:   316   Q2 2012: 5061         
##                     (Other)            :111428   (Other):60343         
##                    MemberKey      MonthlyLoanPayment LP_CustomerPayments
##  63CA34120866140639431C9:     9   Min.   :   0.0     Min.   :   -2.35   
##  16083364744933457E57FB9:     8   1st Qu.: 131.6     1st Qu.: 1005.76   
##  3A2F3380477699707C81385:     8   Median : 217.7     Median : 2583.83   
##  4D9C3403302047712AD0CDD:     8   Mean   : 272.5     Mean   : 4183.08   
##  739C338135235294782AE75:     8   3rd Qu.: 371.6     3rd Qu.: 5548.40   
##  7E1733653050264822FAA3D:     8   Max.   :2251.5     Max.   :40702.39   
##  (Other)                :113888                                         
##  LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees   
##  Min.   :    0.0              Min.   :   -2.35   Min.   :-664.87  
##  1st Qu.:  500.9              1st Qu.:  274.87   1st Qu.: -73.18  
##  Median : 1587.5              Median :  700.84   Median : -34.44  
##  Mean   : 3105.5              Mean   : 1077.54   Mean   : -54.73  
##  3rd Qu.: 4000.0              3rd Qu.: 1458.54   3rd Qu.: -13.92  
##  Max.   :35000.0              Max.   :15617.03   Max.   :  32.06  
##                                                                   
##  LP_CollectionFees  LP_GrossPrincipalLoss LP_NetPrincipalLoss
##  Min.   :-9274.75   Min.   :  -94.2       Min.   : -954.5    
##  1st Qu.:    0.00   1st Qu.:    0.0       1st Qu.:    0.0    
##  Median :    0.00   Median :    0.0       Median :    0.0    
##  Mean   :  -14.24   Mean   :  700.4       Mean   :  681.4    
##  3rd Qu.:    0.00   3rd Qu.:    0.0       3rd Qu.:    0.0    
##  Max.   :    0.00   Max.   :25000.0       Max.   :25000.0    
##                                                              
##  LP_NonPrincipalRecoverypayments PercentFunded    Recommendations   
##  Min.   :    0.00                Min.   :0.7000   Min.   : 0.00000  
##  1st Qu.:    0.00                1st Qu.:1.0000   1st Qu.: 0.00000  
##  Median :    0.00                Median :1.0000   Median : 0.00000  
##  Mean   :   25.14                Mean   :0.9986   Mean   : 0.04803  
##  3rd Qu.:    0.00                3rd Qu.:1.0000   3rd Qu.: 0.00000  
##  Max.   :21117.90                Max.   :1.0125   Max.   :39.00000  
##                                                                     
##  InvestmentFromFriendsCount InvestmentFromFriendsAmount   Investors      
##  Min.   : 0.00000           Min.   :    0.00            Min.   :   1.00  
##  1st Qu.: 0.00000           1st Qu.:    0.00            1st Qu.:   2.00  
##  Median : 0.00000           Median :    0.00            Median :  44.00  
##  Mean   : 0.02346           Mean   :   16.55            Mean   :  80.48  
##  3rd Qu.: 0.00000           3rd Qu.:    0.00            3rd Qu.: 115.00  
##  Max.   :33.00000           Max.   :25000.00            Max.   :1189.00  
## 
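A minimal sketch of the calls that typically produce output like the above. The report does not show its code, so the data frame name `loans` and the toy columns are assumptions; with the real data, `loans` would be read from the Prosper CSV file instead.

```r
# Hypothetical miniature stand-in for the Prosper data frame. With the real
# file, `loans` would come from something like read.csv(); the file name is
# not given in this report.
loans <- data.frame(
  Term = c(36L, 36L, 60L, 12L),
  LoanStatus = factor(c("Current", "Completed", "Current", "Chargedoff")),
  LoanOriginalAmount = c(9425L, 10000L, 15000L, 3000L)
)

n_rows <- nrow(loans)  # row count
print(n_rows)
str(loans)             # type and first values of every column
print(names(loans))    # column names
print(summary(loans))  # per-column summary statistics
```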

Credit Grade and Prosper Rating:

Based on the variable descriptions, these two variables represent the same rating, for listings before 2009 and after 2009, respectively. As we can see, the rating has a roughly normal distribution.
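Since the two variables cover different periods, one way to look at them together is to merge them into a single ordered rating. The sketch below is an assumption about how that could be done, not the code used for this report. The toy values are the first five rows shown in the structure output above, and the level order (HR lowest, AA highest) follows the numeric Prosper rating.

```r
# First five rows of CreditGrade and ProsperRating..Alpha. as decoded from the
# str() output above; "" marks a missing value.
credit_grade   <- c("C", "", "HR", "", "")
prosper_rating <- c("", "A", "", "A", "D")

# Take CreditGrade where present, otherwise fall back to the Prosper rating.
rating <- ifelse(credit_grade != "", credit_grade, prosper_rating)

# Order the levels from highest risk (HR) to lowest risk (AA). Any value that
# is not one of these seven levels becomes NA.
rating <- factor(rating,
                 levels = c("HR", "E", "D", "C", "B", "A", "AA"),
                 ordered = TRUE)
print(table(rating))
```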

Term:

## [1] 1.416572
##    12    36    60 
##  1614 87778 24545

About 77% of loans have a three-year term; we can further explore how this relates to loan amount and/or borrower income. It surprised me to see that some loans (about 1.4%) are only 12 months long!
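The percentages quoted here follow directly from the term counts in the output above. A short sketch of the arithmetic, using those counts as literals (with the full data frame the counts would come from `table()` on the Term column):

```r
# Term counts copied from the output above.
term_counts <- c("12" = 1614, "36" = 87778, "60" = 24545)

# Share of each term as a percentage of all loans.
term_pct <- 100 * term_counts / sum(term_counts)
print(round(term_pct, 2))
```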

Loan Status:

As we can see, a good share of loans are completed, which is a useful indicator of the effectiveness of the Prosper marketplace.

Loan Amount:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    4000    6500    8337   12000   35000
##           n
## 1 0.5968211

It’s clear that most loans cluster around a few values (~4000, ~10000, ~15000), and loans higher than 25000 represent only a very small percentage (~0.6%) of all loans.
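A percentage like the one for loans above 25000 can be computed as the mean of a logical comparison. The helper name `pct_above` and the toy amounts below are hypothetical and only illustrate the calculation.

```r
# Percentage of values strictly above a threshold (hypothetical helper).
pct_above <- function(amounts, threshold) {
  100 * mean(amounts > threshold, na.rm = TRUE)
}

# Toy loan amounts, not the real data.
toy_amounts <- c(1000, 4000, 6500, 10000, 15000, 15000, 26000, 35000)
print(pct_above(toy_amounts, 25000))
```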

Borrower Employment Status:

##                    Employed     Full-time Not available  Not employed 
##          2255         67322         26355          5347           835 
##         Other     Part-time       Retired Self-employed 
##          3806          1088           795          6134

Most loans are taken out by employed borrowers with 1-2 years of employment. We can later explore this in relation to loan amount, income, and loan category.

Category:

##      Not Available Debt Consolidation   Home Improvement 
##              16965              58308               7433 
##           Business      Personal Loan        Student Use 
##               7189               2395                756 
##               Auto              Other      Baby&Adoption 
##               2572              10494                199 
##               Boat Cosmetic Procedure    Engagement Ring 
##                 85                 91                217 
##        Green Loans Household Expenses    Large Purchases 
##                 59               1996                876 
##     Medical/Dental         Motorcycle                 RV 
##               1522                304                 52 
##              Taxes           Vacation      Wedding Loans 
##                885                768                771

It’s clear that most loans are for Debt Consolidation. I expected to see more variation between categories, for example student loans, home improvement, etc. However, this variable isn’t very informative for many records: a large number have no available category, and many others fall under Other, which could be anything!
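The category labels in the table above are not stored in the data, which only holds a numeric code (0 to 20). A sketch of how the codes could be turned into labels, assuming the codes map to the labels in the order the table lists them; the toy codes are the first ten values from the structure output.

```r
# Labels in the order shown in the table above, assumed to match codes 0..20.
category_labels <- c(
  "Not Available", "Debt Consolidation", "Home Improvement", "Business",
  "Personal Loan", "Student Use", "Auto", "Other", "Baby&Adoption", "Boat",
  "Cosmetic Procedure", "Engagement Ring", "Green Loans",
  "Household Expenses", "Large Purchases", "Medical/Dental", "Motorcycle",
  "RV", "Taxes", "Vacation", "Wedding Loans"
)

# First ten ListingCategory..numeric. values from the str() output.
codes <- c(0, 2, 0, 16, 2, 1, 1, 2, 7, 7)

# Convert the numeric codes to a labelled factor and count each category.
category <- factor(codes, levels = 0:20, labels = category_labels)
print(table(droplevels(category)))
```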

Income Range and Monthly Income:

## [1] "$0"             "$1-24,999"      "$100,000+"      "$25,000-49,999"
## [5] "$50,000-74,999" "$75,000-99,999" "Not displayed"  "Not employed"

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0    3200    4667    5608    6825 1750003

As we can see, the income range has a roughly normal distribution, and the typical monthly income falls around 4000-5000 (median 4667), which is consistent with the income range. I detected some outliers in the monthly income, so I had to change the limits of my plot to get a better representation of the data.
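One common way to choose plot limits in the presence of extreme outliers is to cut at an upper quantile. The sketch below is an assumption about how this could be done, not necessarily the limit used for the plot in this report; the toy incomes and the 90% cutoff are illustrative only.

```r
# Toy monthly incomes with one extreme value, not the real data.
income <- c(0, 2500, 3200, 4667, 5608, 6825, 9000, 1750003)

# Upper limit at the 90th percentile (an arbitrary choice for this toy data).
upper <- quantile(income, 0.9, names = FALSE)

# Keep only the values at or below the limit for plotting.
trimmed <- income[income <= upper]
print(summary(trimmed))
```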

Debt to Income Ratio:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.000   0.140   0.220   0.276   0.320  10.010    8554

The DTI ratio is small for most borrowers (median 0.22, third quartile 0.32), which is a good sign for investors. Some outliers were detected, so they will be removed.

Lender Yield:

Borrower Rate:

These two histograms show the lender yield and the borrower rate. We will further explore these variables in relation to other factors as well.

Number of Investors:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00    2.00   44.00   80.48  115.00 1189.00
## [1] "Number of loans backed by one invertor:"
## [1] 27814
## [1] "Number and percentage of loans backed by one invertor:"
## [1] 86123
## [1] 75.58826

It’s interesting to discover that around 75.6% of loans (86,123) are backed by more than one investor, while the remaining 27,814 loans are backed by only one. We can further explore this in relation to other factors. I will also explore loans that are backed by only one investor.
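The percentage can be reproduced from the two counts printed above. This is a minimal Python sketch of the arithmetic only; the variable names are illustrative.

```python
# Counts taken from the output above.
total_loans = 113937
single_investor = 27814

# Every loan has at least one investor, so the rest have more than one.
multi_investor = total_loans - single_investor

pct_multi = 100 * multi_investor / total_loans
pct_single = 100 * single_investor / total_loans

print(multi_investor)
print(round(pct_multi, 5))
print(round(pct_single, 2))
```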

Prosper Risk Assessment:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    1.00    4.00    6.00    5.95    8.00   11.00   29084

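This assessment is based on Prosper’s own historical data, with 10 being the best, or lowest-risk, score. Its distribution looks roughly normal. Note that the summary above reports a maximum of 11, which is outside the 1-10 range.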
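Since the summary reports a maximum of 11 and a large number of missing values, a quick range check is a reasonable sanity step before plotting. The Python sketch below uses made-up scores, with None standing in for NA. The 1-10 bounds are assumed from the statement that 10 is the best score.

```python
# Hypothetical Prosper Score values; None stands for a missing score (NA).
scores = [4, 6, 8, None, 11, 1, 10, None, 7, 11]

# Assumed documented range of the score.
DOCUMENTED_MIN, DOCUMENTED_MAX = 1, 10

present = [s for s in scores if s is not None]
missing_count = len(scores) - len(present)

# Scores that fall outside the assumed range.
out_of_range = [s for s in present
                if not DOCUMENTED_MIN <= s <= DOCUMENTED_MAX]

print(f"missing: {missing_count}")
print(f"outside {DOCUMENTED_MIN}-{DOCUMENTED_MAX}: {len(out_of_range)}")
```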

Loans Originated by Year:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    2005    2008    2012    2011    2013    2014

As we can see, there is a visible increase each year, except for 2014, and there are also some drops in the years 2009-2011!

Univariate Analysis

In this section I have explored the dataset and set some questions to answer through the analysis.

In the next section I will further explore the dataset based on pairs of related variables.

What is the structure of your dataset?

As we have seen, the structure of the Prosper dataset can be viewed from two aspects: borrowers and loans. Borrower information includes data provided by the user along with information from credit bureaus. Loan data includes details for each loan, which will be valuable in understanding the performance and effectiveness of this marketplace.

What is/are the main feature(s) of interest in your dataset?

The main features of interest are:

  • Credit Score and Prosper Rating.

  • Loan Amount.

  • Monthly Payment.

  • Employment Status and Duration.

  • Borrower State.

  • Risk Assessment.

  • Loan Category.

  • Borrower Rate.

  • Investor Yield.

  • Loan Term.

  • Income Range and Monthly Income.

  • Number of Investors.

I also want to explore more variables in the next section that may be related to Prosper Score.

What other features in the dataset do you think will help support your
investigation into your feature(s) of interest?

It would be even more interesting to have data on investors, and to explore relationships from their perspective.

Did you create any new variables from existing variables in the dataset?

A new feature (Year) has been created from Listing Creation Date, to give a better understanding of the dataset.
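As a rough illustration of how such a Year feature can be derived, here is a minimal Python sketch. The helper name `listing_year` is hypothetical, and the sample timestamp is the first ListingCreationDate level shown in the dataset structure at the top of this report.

```python
from datetime import datetime


def listing_year(timestamp: str) -> int:
    """Return the calendar year of a listing timestamp string."""
    # The dataset stores nine fractional-second digits, but strptime's %f
    # accepts at most six, so only the first 19 characters
    # ("YYYY-MM-DD HH:MM:SS") are parsed.
    return datetime.strptime(timestamp[:19], "%Y-%m-%d %H:%M:%S").year


year = listing_year("2005-11-09 20:44:28.847000000")
print(year)
```

Parsing the full date rather than slicing the first four characters has the side benefit of rejecting malformed timestamps with an error.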

Of the features you investigated, were there any unusual distributions?
Did you perform any operations on the data to tidy, adjust, or change the form
of the data? If so, why did you do this?

I have made some modifications to some variables to better understand and represent the data. Each one is explained under the related plot.

Bivariate Plots Section

In this section we will take the analysis to a deeper level and try to explore relationships and correlations between variables.

Correlation Matrix:

How does Prosper assess loans and score risk?

One major factor in the Prosper marketplace is how it assesses loans, as this affects the estimated loss and the borrower’s score, and thus determines the borrower rate and lender yield. The previous two plots summarize how this was approached and its effects.

Prosper defines this score as: “A custom risk score built using historical Prosper data. The score ranges from 1-10, with 10 being the best, or lowest risk score. Applicable for loans originated after July 2009.”

According to Prosper[3,4], it has built its own risk metric (Prosper Score) using historical Prosper data to assess the risk of Prosper borrower listings. They analyzed the similarity of loan characteristics and how the loans performed. In addition, Prosper reviewed several potential factors from the loan listing details provided by borrowers and from the reporting credit bureau to build its score. Prosper finally determined which ones to use and keep; these can be viewed in the previous plots:

It is negatively correlated with: Number of inquiries, Number of delinquent accounts, Card utilization, Number of open trade lines, and Debt to income ratio, which shows the highest correlation. Also, to illustrate the importance of this score, it is highly correlated with Prosper Rating and Estimated Loss.

We can also see that it is correlated with Credit Score Range (Lower and Upper), which represents the FICO®08 score. This score is used to specify the estimated loss in addition to Prosper Score. From the correlation matrix we can see that each of these scores correlates with the other factors, but to a different degree, which tells us that Prosper Score uses a different approach that includes historical data.

Although not stated, we can see a clear correlation between monthly income, loan amount, and Prosper Score. We can also see a correlation between Lender Yield and Borrower Rate, which is not surprising.

How does the Term relate to Loan Amount?

## Term: 12
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    2000    3500    4694    5000   25000 
## -------------------------------------------------------- 
## Term: 36
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    3000    5000    7276   10000   35000 
## -------------------------------------------------------- 
## Term: 60
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    2000    8000   11500   12370   15000   35000

Higher loan amounts tend to come with longer terms. The median and mean are higher for the 5-year term (11,500 and 12,370) than for the 3-year term (5,000 and 7,276). In the second plot we can see the distribution of loan amount for each term.
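The per-term summaries above come from grouping loan amounts by term. A minimal Python sketch of that kind of grouped summary is shown below. The loan records are made up for illustration and are not rows from the dataset.

```python
import statistics
from collections import defaultdict

# Hypothetical (term in months, loan amount in dollars) pairs.
loans = [(12, 2000), (12, 3500), (12, 5000),
         (36, 3000), (36, 5000), (36, 10000),
         (60, 8000), (60, 11500), (60, 15000)]

# Collect the loan amounts that belong to each term.
amounts_by_term = defaultdict(list)
for term, amount in loans:
    amounts_by_term[term].append(amount)

# Median and mean loan amount per term, ordered by term length.
summary = {
    term: (statistics.median(amounts), statistics.mean(amounts))
    for term, amounts in sorted(amounts_by_term.items())
}

for term, (median_amount, mean_amount) in summary.items():
    print(f"Term {term}: median={median_amount}, mean={mean_amount:.0f}")
```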

Borrower State and Loan Amount:

This plot shows how each state differs in terms of loan amounts. We can see a higher median in some states, such as California, New York, and Texas, whereas some states have barely made any loans, for example Alaska and Maine, which may be related to the population and economy of those states.

Prosper Rating and Loan Amount:

As we have seen from the correlation matrix, Prosper Rating correlates with the loan amount. This plot shows how the loan amount is affected by the rating.

Loan Amount and Category:

This plot shows the distribution of loan amount for each category.

Employment Status and Loan Amount:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   0.000   2.167   5.583   8.006  11.417  62.917    7625

It is clear that higher loans are made by employed and self-employed borrowers. In the second plot we can see how Prosper Rating relates to employment status and its statistics. We can also see the statistics of employment duration based on employment status.

Monthly Payment and Loan Amount:

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0   131.6   217.7   272.5   371.6  2251.5
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    4000    6500    8337   12000   35000
## [1] "Percentage of loan payments below $500"
## [1] 87.27279
## [1] "Correlation"
## [1] 0.9319837

This plot shows a strong correlation (about 0.93) between loan amount and monthly payment: as the loan amount increases, the monthly payment increases. However, around 87% of the loans have a monthly payment below $500 regardless of the original loan amount. We can view this in relation to the Term of the loan and explore how it is affected.
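The two figures quoted here, a correlation coefficient and a share of payments below $500, can be computed as in the following minimal Python sketch. The (amount, payment) pairs are made up for illustration, so the printed values will not match the dataset's 0.93 and 87%. The `pearson` helper is written out by hand only to keep the example self-contained.

```python
import math

# Hypothetical (loan amount, monthly payment) pairs in dollars.
pairs = [(1000, 40.0), (4000, 131.6), (6500, 217.7),
         (12000, 371.6), (35000, 1100.0)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


amounts = [a for a, _ in pairs]
payments = [p for _, p in pairs]

r = pearson(amounts, payments)

# Percentage of loans whose monthly payment is below $500.
share_below_500 = 100 * sum(p < 500 for p in payments) / len(payments)

print(f"correlation={r:.3f}, below $500: {share_below_500:.0f}%")
```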

Monthly Payment and Income Range:

## IncomeRange: $0
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00   87.14  169.68  267.47  347.58 1130.90 
## -------------------------------------------------------- 
## IncomeRange: $1-24,999
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00   86.38  134.28  154.67  173.71 1047.64 
## -------------------------------------------------------- 
## IncomeRange: $25,000-49,999
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0   118.9   173.7   210.4   282.8  1382.1 
## -------------------------------------------------------- 
## IncomeRange: $50,000-74,999
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0   155.8   253.1   280.3   383.7  1778.4 
## -------------------------------------------------------- 
## IncomeRange: $75,000-99,999
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0   169.5   301.6   329.4   457.2  2111.8 
## -------------------------------------------------------- 
## IncomeRange: $100,000+
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0   208.6   375.0   412.2   560.1  2251.5 
## -------------------------------------------------------- 
## IncomeRange: Not displayed
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00   76.62  122.48  182.36  217.77 1047.64 
## -------------------------------------------------------- 
## IncomeRange: Not employed
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00   90.28  169.62  183.79  217.74 1085.67

In this plot I wanted to explore the relation between monthly payment and income range. The mean and median differ for each range; however, they are still below $500, as we concluded from the previous plot.

Investors and Loan Amount:

## Investors: 1
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    5000   10000   11011   15000   35000

## ManyInvestors: FALSE
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    5000   10000   11011   15000   35000 
## -------------------------------------------------------- 
## ManyInvestors: TRUE
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    3001    5000    7474   10000   35000

From these plots we can examine the relation between loan amount and investors. I wanted to break it down into two groups, as we saw earlier in the univariate analysis that many loans are funded by only one investor.

The first scatter plot represents loans with more than one investor, with quantile regression lines added; we can see that the number of investors tends to increase as the loan amount increases. The second plot focuses only on loans funded by one investor, broken down by Prosper Rating. I expected to see a lower mean for the loan amount, but that doesn’t seem to be the case, so I created a new variable to accurately compare the two cases. In the third plot we can see that loans funded by only one investor have a higher mean (11,011) than loans with many investors (7,474). This is interesting; I wonder what makes a person invest like that. In the multivariate analysis I will see whether these loans have a higher lender yield or not, and how this relates to Term.
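The comparison between the two groups can be sketched in Python as follows. The loan records are made up for illustration. The flag definition (more than one investor) is an assumption, inferred from the fact that the ManyInvestors: FALSE summary above matches the Investors: 1 summary.

```python
import statistics

# Hypothetical loans: (number of investors, loan amount in dollars).
loans = [(1, 10000), (1, 15000), (1, 5000),
         (44, 3000), (115, 5000), (250, 10000), (80, 4000)]

# Split loan amounts by the derived flag.
groups = {True: [], False: []}
for investors, amount in loans:
    many_investors = investors > 1  # assumed definition of ManyInvestors
    groups[many_investors].append(amount)

mean_single = statistics.mean(groups[False])
mean_many = statistics.mean(groups[True])

print(f"one investor: n={len(groups[False])}, mean={mean_single:.0f}")
print(f"many investors: n={len(groups[True])}, mean={mean_many:.0f}")
```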

Bivariate Analysis

In this section I have explored the dataset from different aspects relating to two variables, and found some interesting relationships and insights about Prosper marketplace.

Talk about some of the relationships you observed in this part of the
investigation. How did the feature(s) of interest vary with other features in
the dataset?

There are many relationships among the features; the strongest relate to how Prosper assesses loans and evaluates them accordingly. The rest of the observations were mainly to understand the characteristics of the marketplace and its users.

Did you observe any interesting relationships between the other features
(not the main feature(s) of interest)?

Actually, the relation between the number of investors and loan amount was quite unusual. Although loans funded by only one investor are a minority (27,814 of 113,937), it is interesting to explore them.

What was the strongest relationship you found?

There are many strong correlations in this dataset, however the strongest meaningful relationship was between each of (Lender Yield, Borrower Rate, Estimated Loss) and Prosper Score.

Multivariate Plots Section

Investors and Loan Amount (Extended):

These plots serve as a continuation of the bivariate analysis, where we explored the relation between investors and loan amount. In the first plot I added the variable Term to show how the amount differs for each term; in the case of one investor the mean is higher.

The second plot shows the relation between investors and lender yield for each year. From this plot we can detect that loans funded by one investor did not start to increase until recently (2013-2014)!

Monthly Payment and Monthly Income by Year:

This plot gives an overview of the Income Range and Prosper Rating for each year. We can see that higher incomes tend to have better ratings, and there aren’t any HR ratings for the two highest ranges in recent years. Also, for the lower ranges the rating isn’t biased; we can still find A and AA ratings.

Monthly Payment and Loan Amount by Prosper Score:

This plot shows how Prosper Score affects monthly payment and loan amount.

Loan Amount by Year for Term:

In this plot we can look at the mean loan amount for each year and each term. We can see that 12-month loans are no longer found in 2014, and that 60-month loans didn’t start until 2010. The 36-month term seems to be the most popular of the options.

Loan Amount, Monthly Payment based on Prosper Rating:

These plots show the distribution of Loan Amount and Monthly Payment in each year, color coded by Prosper Rating.

Multivariate Analysis

In this section I have combined the findings from previous sections and incorporated more variables.

Talk about some of the relationships you observed in this part of the
investigation. Were there features that strengthened each other in terms of
looking at your feature(s) of interest?

One major factor in the dataset is Prosper Score. I have explored how it correlates with Loan Amount and Monthly Payment, and it actually demonstrates the effectiveness of this metric.

Were there any interesting or surprising interactions between features?


Final Plots and Summary

Plot One

Description One

These plots give an overall view of the Prosper marketplace. They show the main characteristic of loans, which is the amount of the loans made, for each year from 2005 to 2014, and they also include the term of those loans. We can see a lot from these simple plots: they show how Prosper developed and evolved over time, and also its value based on the loan amounts being processed each year. The term information tells us how borrowers engage with the marketplace, and how much they trust it with their finances.

Plot Two

Description Two

Prosper is a P2P marketplace that allows many investors to each fund part of a loan. I noticed there were many loans funded by only one investor, and these plots summarize my findings. The first shows the distribution of loan amounts and its statistics as a boxplot. The second, which is important for investors, shows the Lender Yield for each category. We can see that in recent years there has been an increase in loans funded by one investor; this may reflect the trust Prosper has developed with its investors, who now invest at higher rates.

Plot Three

Description Three

Based on the observations made in the earlier sections, Prosper Score is the main factor developed by Prosper to assess risk and therefore estimate the loss on loans. We also have another variable used to identify the quality of loans, Prosper Rating, which is Prosper’s proprietary rating for evaluating loans. These two factors are important for investors in order to make good decisions. There are more detailed factors considered when Prosper developed its scoring metric, so we can see how that changed over time toward more accuracy.

In the first plot we can see that in the first years most Ratings have a higher Score, which means lower risk! But this later changed so that each Rating falls in a consistent Score range. This can tell us how the metric was developed on historical loan data.

The second plot shows the Estimated Loss for each year based on Prosper Score. We gain many insights from this plot: we can actually see improvements in the scoring and estimated loss, and we can also see that the estimated loss is lower each year, which is a good indicator of the marketplace’s effectiveness.


Reflection

The Prosper dataset is very insightful and contains loads of information. It took me some time to understand the meaning of the variables, as I have never taken a loan or had any work experience in this field, so it was almost all new to me. That wasn’t a setback at all; it just required more research and more reading on the subject, and I am surprised by the amount of information I ended up learning personally, let alone the work done for this analysis.

Initially I wanted to understand the characteristics of borrowers, which was covered in the univariate analysis section; some of the findings were surprising to me, such as the Loan Category and Employment Status. Next, I went on to explore what makes Prosper different from usual loan agencies, which relates to how it evaluates loans and risks; I showed this in the bivariate analysis section. Finally, I combined all the work together and explored some interesting insights in the multivariate analysis section.

There were some limitations and obstacles I encountered during my analysis. I would have liked more information on investors, to explore the dataset from their perspective: whether they return for another investment, the actual return after the loan is complete, etc. I believe it could be very valuable.

I wanted to explore the data related to borrowers who returned for another loan, and how their performance changes. I was also interested in showing how each state differs with respect to its population, but that would require looking for this data in another source, so for the scope of this analysis I decided to return to it some other time.

There is possible future work on this dataset, and there are many regression models I can think of that would be suitable. Given some variables I already explored, I want to build a model to assign the score for new loans. In addition, we can further explore the data related to defaulted loans and develop a predictive model to detect these cases before they happen. With data on investors we could build a model to find the right investor for each loan and speed up the process.

Resources:

[1] http://www.altfi.com/article/1639_prospers_2015_in_numbers

[2] https://en.wikipedia.org/wiki/Prosper_Marketplace#cite_note-2

[3] https://www.prosper.com/plp/general-prosper_score/

[4] https://www.prosper.com/plp/general-estimated_loss_rates/